    Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems

    Transferring the knowledge of large language models (LLMs) is a promising technique for incorporating linguistic knowledge into end-to-end automatic speech recognition (ASR) systems. However, existing works transfer only a single representation of an LLM (e.g., the last layer of a pretrained BERT), even though the representation of a text is inherently non-unique and can be obtained in various ways from different layers, contexts, and models. In this work, we explore a wide range of techniques to obtain and transfer multiple representations of LLMs into a transducer-based ASR system. Although the approach is conceptually simple, we show that transferring multiple representations of LLMs can be an effective alternative to transferring only a single representation. Comment: Submitted to ICASSP 202

    Coded-MPMC: One-to-Many Transfer Using Multipath Multicast With Sender Coding

    Fast and efficient one-to-many transfers are essential to meet the growing need for duplicating, migrating, or sharing bulk data among servers in a datacenter and across geographically distributed datacenters. Some existing works utilize multiple multicast trees for a one-to-many transfer request to increase network link utilization and transfer throughput. However, since those schemes do not fully utilize the max-flow value of transmission from a single sender to each recipient, there is room for each recipient to retrieve data more quickly. Therefore, assuming fully controlled networks with full-duplex links, we pose the problem of finding a set of multicast flows, with an allocation of block-wise transmissions, by which each of multiple recipients with diverse max-flow values from the sender can utilize its own max-flow value. Based on that, assuming a sender-side coding capability on file blocks, we design a schedule of block transmissions over multiple phases by which each recipient can achieve the lower bound on its file retrieval completion time, i.e., the file size divided by its own max-flow value. This paper presents the Coded Multipath Multicast (Coded-MPMC) method for one-to-many transfers, with heuristic procedures to find a desired set of multicast flows on which block transmissions are scheduled. Through extensive simulations on large-scale real-world network topologies and different types of randomly generated synthetic topologies, the proposed method is shown to design a desired schedule efficiently. A preliminary implementation on OpenFlow is also reported to show the fundamental feasibility of Coded-MPMC.
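
    The lower bound mentioned in this abstract (file size divided by each recipient's max-flow value from the sender) can be illustrated with a minimal sketch. This is not the Coded-MPMC scheduling procedure itself; it only computes the bound. The topology, node names, capacities (in blocks per second), file size (in blocks), and the function names `max_flow` and `completion_lower_bounds` are all hypothetical, and a textbook Edmonds-Karp routine stands in for whatever max-flow computation the authors use.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp max-flow on a dict-of-dicts graph of directed link capacities."""
    # Build a residual graph; every edge gets a reverse edge (capacity 0 if absent).
    residual = {}
    for u, neighbours in capacity.items():
        for v, c in neighbours.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)
    if source not in residual or sink not in residual:
        return 0
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def completion_lower_bounds(capacity, sender, recipients, file_size):
    """Per-recipient lower bound on retrieval time: file_size / max-flow value."""
    bounds = {}
    for r in recipients:
        flow = max_flow(capacity, sender, r)
        bounds[r] = file_size / flow if flow > 0 else float("inf")
    return bounds


# Hypothetical topology: sender "s", relays "a" and "b", recipients "r1" and "r2".
links = {"s": {"a": 10, "b": 5}, "a": {"r1": 10, "r2": 4}, "b": {"r1": 5}}
bounds = completion_lower_bounds(links, "s", ["r1", "r2"], file_size=60)
```

    In this toy topology the two recipients have different max-flow values, so their bounds differ; that diversity is the situation the abstract's schedule is designed to handle.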

    Experiments of Multipath Multicast One-to-many Transfer with RS coding over Wide-Area OpenFlow Testbed Network

    Fast and efficient one-to-many transfers of a large file are increasingly important for replicating, moving, or sharing bulk data, not only within a datacenter but also across geographically distributed datacenters. We previously proposed the Coded Multipath Multicast (Coded-MPMC) one-to-many file transfer method, in which multicast transfer, multipath transfer, and Reed-Solomon (RS) coding are integrated. This method aims to minimize the retrieval completion time of each recipient by simultaneously transmitting blocks of a file on multiple paths from a single sender to each recipient to maximize the aggregated flow value, i.e., to realize the max-flow to the recipient. We preliminarily implemented Coded-MPMC with the OpenFlow protocol; however, we tested its feasibility only over a small homogeneous in-lab OpenFlow network. In this paper, through experiments on a wide-area OpenFlow testbed network, we show that Coded-MPMC works correctly in a heterogeneous and geographically distributed network. The results suggest the practicability and potential benefits of Coded-MPMC in real networks. 2020 International Conference on Emerging Technologies for Communications (ICETC2020), December 2-4, 2020, Online, Virtual Conference

    English Broadcast News Speech Recognition by Humans and Machines

    With recent advances in deep learning, considerable attention has been given to achieving automatic speech recognition performance close to human performance on tasks like conversational telephone speech (CTS) recognition. In this paper we evaluate the usefulness of these proposed techniques on broadcast news (BN), a similarly challenging task. We also perform a set of recognition measurements to understand how close the achieved automatic speech recognition results are to human performance on this task. On two publicly available BN test sets, DEV04F and RT04, our speech recognition system using LSTM and residual network based acoustic models with a combination of n-gram and neural network language models performs at 6.5% and 5.9% word error rate, respectively. By achieving new performance milestones on these test sets, our experiments show that techniques developed on other related tasks, like CTS, can be transferred to achieve similar performance. In contrast, the best measured human word error rate on these test sets is much lower, at 3.6% and 2.8% respectively, indicating that there is still room for new techniques and improvements in this space to reach human performance levels. Comment: © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
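
    The word error rate figures quoted in this abstract follow the usual definition: the minimum number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. A minimal sketch of that computation is given below. The function name `word_error_rate` and the example sentences are hypothetical, and the sketch splits on whitespace only; the text normalization used in the paper's scoring is not described in the abstract and is not reproduced here.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]  # aligning i reference words with an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one deletion ("the").
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

    Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.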

    Behavior-level Analysis of a Successive Stochastic Approximation Analog-to-Digital Conversion System for Multi-channel Biomedical Data Acquisition

    In the present paper, we propose a novel high-resolution analog-to-digital converter (ADC) for low-power biomedical analog frontends, which we call the successive stochastic approximation ADC. The proposed ADC uses a stochastic flash ADC (SF-ADC) to realize a digitally controlled variable-threshold comparator in a successive-approximation-register ADC (SAR-ADC), which can correct errors originating from the internal digital-to-analog converter in the SAR-ADC. For the residual error after SAR-ADC operation, which can be smaller than thermal noise, the SF-ADC uses the statistical characteristics of noise to achieve high resolution. The SF-ADC output for the residual signal is combined with the SAR-ADC output to obtain high-precision output data using a supervised machine learning method.
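
    The idea that noise statistics can resolve a residual smaller than the noise itself can be illustrated with a generic simulation. This is not the authors' circuit or their learning-based combination step; it is a sketch under the assumption that each comparison sees the residual plus independent zero-mean Gaussian noise of known standard deviation. The function name `estimate_residual` and all numeric values are hypothetical.

```python
import random
from statistics import NormalDist


def estimate_residual(residual, noise_sigma, n_comparisons, rng):
    """Estimate a small residual voltage from many noisy 1-bit comparisons."""
    # Each comparison reports whether residual + Gaussian noise is above zero.
    ones = sum(
        1
        for _ in range(n_comparisons)
        if residual + rng.gauss(0.0, noise_sigma) > 0.0
    )
    # The fraction of ones estimates Phi(residual / noise_sigma); clamp it so
    # the inverse CDF below is never evaluated at exactly 0 or 1.
    p = ones / n_comparisons
    p = min(max(p, 0.5 / n_comparisons), 1.0 - 0.5 / n_comparisons)
    # Invert the standard normal CDF to recover the residual.
    return noise_sigma * NormalDist().inv_cdf(p)


# Hypothetical values: residual of 0.2 noise standard deviations.
rng = random.Random(0)
estimate = estimate_residual(0.2, 1.0, 20000, rng)
```

    A single comparison says almost nothing about a residual this small, but the fraction of positive outcomes across many comparisons converges to a value that determines it, which is the statistical effect the abstract refers to.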